Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection

Abstract

Integrating multispectral data in object detection, especially visible and infrared images, has received great attention in recent years. Since visible (RGB) and infrared (IR) images can provide complementary information to handle light variations, the paired images are used in many fields, such as multispectral pedestrian detection, RGB-IR crowd counting and RGB-IR salient object detection. Compared with natural RGB-IR images, we find that detection in aerial RGB-IR images suffers from weak cross-modal misalignment, which is manifested in position, size and angle deviations of the same object. In this paper, we mainly address the challenge of weak cross-modal misalignment in aerial RGB-IR images. Specifically, we first explain and analyze the cause of the weak misalignment problem. Then, we propose a Translation-Scale-Rotation Alignment (TSRA) module to address the problem by calibrating the feature maps from these two modalities. The module predicts the deviation between the two modality objects through an alignment process and utilizes a Modality-Selection (MS) strategy to improve the performance of alignment. Finally, a two-stream feature alignment detector (TSFADet) based on the TSRA module is constructed for RGB-IR object detection in aerial images. With comprehensive experiments on the public DroneVehicle dataset, we verify that our method reduces the effect of the cross-modal misalignment and achieves robust detection results.
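To make the three kinds of deviation concrete, the following minimal sketch applies a translation, a scale change and a rotation to an oriented bounding box. It is only a geometric illustration, not the TSRA module itself, which the abstract describes as calibrating feature maps inside a detector. The names `OrientedBox`, `apply_tsr` and `corners`, and the numeric offsets, are hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OrientedBox:
    cx: float     # center x, in pixels
    cy: float     # center y, in pixels
    w: float      # width
    h: float      # height
    angle: float  # rotation, in radians


def apply_tsr(box, dx, dy, sw, sh, dtheta):
    """Hypothetical helper: shift the center, rescale the sides, rotate the box."""
    return OrientedBox(
        cx=box.cx + dx,
        cy=box.cy + dy,
        w=box.w * sw,
        h=box.h * sh,
        angle=box.angle + dtheta,
    )


def corners(box):
    """Return the four corner points of an oriented box as (x, y) tuples."""
    c, s = math.cos(box.angle), math.sin(box.angle)
    half = [(-box.w / 2, -box.h / 2), (box.w / 2, -box.h / 2),
            (box.w / 2, box.h / 2), (-box.w / 2, box.h / 2)]
    return [(box.cx + x * c - y * s, box.cy + x * s + y * c) for x, y in half]


# Assumed example: a box seen in the IR image and the offsets that would
# map it onto the same vehicle in the RGB image.
ir_box = OrientedBox(cx=50.0, cy=40.0, w=20.0, h=10.0, angle=0.0)
aligned = apply_tsr(ir_box, dx=2.0, dy=-1.0, sw=1.1, sh=0.9,
                    dtheta=math.radians(5))
```

In this toy setting the five offsets are given by hand; in the approach summarized above they would be predicted by the alignment module.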

Similar resources

Translation, Rotation and Scale Invariant Object

A method for object recognition invariant under translation, rotation and scaling is addressed. The first step of the method (preprocessing) takes into account the invariant properties of the normalized moment of inertia and a novel coding that extracts topological object characteristics. The second step (recognition) is achieved by using a Holographic Nearest Neighbor algorithm (HNN), where vec...

ROTATION, SCALE AND TRANSLATION INVARIANT DIGITAL IMAGE WATERMARKING

A digital watermark is an invisible mark embedded in a digital image which may be used for copyright protection. This paper describes how Fourier-Mellin transform-based invariants can be used for digital image watermarking. The embedded marks are designed to be unaffected by any combination of rotation, scale and translation transformations. The original image is not required for extracting the...

Rotation, Scale and Translation Invariant Digital

A digital watermark is an invisible mark embedded in a digital image which may be used for copyright protection. This paper describes how Fourier-Mellin transform-based invariants can be used for digital image watermarking. The embedded marks are designed to be unaffected by any combination of rotation, scale and translation transformations. The original image is not required for extracting the...

Translation, rotation, and scale-invariant object recognition

A method for object recognition, invariant under translation, rotation, and scaling, is addressed. The first step of the method (preprocessing) takes into account the invariant properties of the normalized moment of inertia and a novel coding that extracts topological object characteristics. The second step (recognition) is achieved by using a holographic nearest-neighbor algorithm (HNN), in wh...

RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning

In this work, we propose to utilize Convolutional Neural Networks (CNNs) to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the "data-hungry" nature of CNNs and the unavailability of suff...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-20077-9_30